166 research outputs found

    A dynamic network approach for the study of human phenotypes

    The use of networks to integrate different genetic, proteomic, and metabolic datasets has been proposed as a viable path toward elucidating the origins of specific diseases. Here we introduce a new phenotypic database summarizing correlations obtained from the disease history of more than 30 million patients in a Phenotypic Disease Network (PDN). We present evidence that the structure of the PDN is relevant to the understanding of illness progression by showing that (1) patients develop diseases close in the network to those they already have; (2) the progression of disease along the links of the network is different for patients of different genders and ethnicities; (3) patients diagnosed with diseases which are more highly connected in the PDN tend to die sooner than those affected by less connected diseases; and (4) diseases that tend to be preceded by others in the PDN tend to be more connected than diseases that precede other illnesses, and are associated with higher degrees of mortality. Our findings show that disease progression can be represented and studied using network methods, offering the potential to enhance our understanding of the origin and evolution of human diseases. The dataset introduced here, released concurrently with this publication, represents the largest relational phenotypic resource publicly available to the research community. Comment: 28 pages (double space), 6 figures

    Mapping gene associations in human mitochondria using clinical disease phenotypes

    Nuclear genes encode most mitochondrial proteins, and their mutations cause diverse and debilitating clinical disorders. To date, 1,200 of these mitochondrial genes have been recorded, while no standardized catalog exists of the associated clinical phenotypes. Such a catalog would be useful to develop methods to analyze human phenotypic data, to determine genotype-phenotype relations among many genes and diseases, and to support the clinical diagnosis of mitochondrial disorders. Here we establish a clinical phenotype catalog of 174 mitochondrial disease genes and study associations of diseases and genes. Phenotypic features such as clinical signs and symptoms were manually annotated from full-text medical articles and classified based on the hierarchical MeSH ontology. This classification of phenotypic features of each gene allowed for the comparison of diseases between different genes. In turn, we were then able to measure the phenotypic associations of disease genes for which we calculated a quantitative value that is based on their shared phenotypic features. The results showed that genes sharing more similar phenotypes have a stronger tendency for functional interactions, proving the usefulness of phenotype similarity values in disease gene network analysis. We then constructed a functional network of mitochondrial genes and discovered a higher connectivity for non-disease than for disease genes, and a tendency of disease genes to interact with each other. Utilizing these differences, we propose 168 candidate genes that resemble the characteristic interaction patterns of mitochondrial disease genes. Through their network associations, the candidates are further prioritized for the study of specific disorders such as optic neuropathies and Parkinson disease. 
Most mitochondrial disease phenotypes involve several clinical categories including neurologic, metabolic, and gastrointestinal disorders, which might indicate the effects of gene defects within the mitochondrial system. The accompanying knowledgebase (http://www.mitophenome.org/) supports the study of clinical diseases and associated genes.
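The abstract above describes a quantitative association between genes based on their shared phenotypic features but does not give the formula. As a minimal sketch, assuming a plain Jaccard index as a stand-in for the authors' measure, pairwise phenotype similarity could be computed as follows. The gene names and phenotype terms are hypothetical placeholders, not entries from the catalog.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two sets: shared items / all items (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_similarity(annotations):
    """Return {(gene1, gene2): similarity} for every unordered pair of genes."""
    return {
        (g1, g2): jaccard(annotations[g1], annotations[g2])
        for g1, g2 in combinations(sorted(annotations), 2)
    }


# Hypothetical gene -> phenotype-term annotations (illustrative only).
annotations = {
    "GENE_A": {"optic atrophy", "ataxia", "hearing loss"},
    "GENE_B": {"optic atrophy", "ataxia", "myopathy"},
    "GENE_C": {"lactic acidosis"},
}

scores = pairwise_similarity(annotations)
for pair, value in scores.items():
    print(pair, round(value, 2))
```

A set-overlap score like this ignores the MeSH hierarchy mentioned in the abstract; a hierarchy-aware measure would treat related terms as partial matches.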

    The Evolving Transcriptome of Head and Neck Squamous Cell Carcinoma: A Systematic Review

    BACKGROUND: Numerous studies were performed to illuminate mechanisms of tumorigenesis and metastases from gene expression profiles of Head and Neck Squamous Cell Carcinoma (HNSCC). The objective of this review is to conduct a network-based meta-analysis to identify the underlying biological signatures of the HNSCC transcriptome. METHODS AND FINDINGS: We grouped 63 HNSCC transcriptomic studies into three specific categories of comparisons: Pre, premalignant lesions vs. normal; TvN, primary tumors vs. normal; and Meta, metastatic or invasive vs. primary tumors. Reported genes extracted from the literature were systematically analyzed. Participation of differential gene activities across three progressive stages deciphered the evolving nature of HNSCC. In total, 1442 genes were verified, i.e., reported at least twice, with ECM1, EMP1, CXCL10 and POSTN shown to be highly reported across all three stages. Knowledge-based networks of the HNSCC transcriptome were constructed, demonstrating integrin signaling and antigen presentation pathways as highly enriched. Notably, functional estimates derived from topological characteristics of integrin signaling networks identified such important genes as ITGA3 and ITGA5, which were supported by findings of invasiveness in vitro. Moreover, we computed genome-wide probabilities of reporting differential gene activities for the Pre, TvN, and Meta stages, respectively. Results highlighted chromosomal regions of 6p21, 19p13 and 19q13, where genomic alterations were shown to be correlated with the nodal status of HNSCC. CONCLUSIONS: By means of a systems-biology approach via network-based meta-analyses, we provided a deeper insight into the evolving nature of the HNSCC transcriptome. 
Enriched canonical signaling pathways, hot-spots of transcriptional profiles across the genome, as well as topologically significant genes derived from network analyses were highlighted for each of the three progressive stages: Pre, TvN, and Meta.

    Knowledge-based analysis of microarrays for the discovery of transcriptional regulation relationships

    Background: The large amount of high-throughput genomic data has facilitated the discovery of the regulatory relationships between transcription factors and their target genes. While early methods for discovery of transcriptional regulation relationships from microarray data often focused on the high-throughput experimental data alone, more recent approaches have explored the integration of external knowledge bases of gene interactions. Results: In this work, we develop an algorithm that provides improved performance in the prediction of transcriptional regulatory relationships by supplementing the analysis of microarray data with a new method of integrating information from an existing knowledge base. Using a well-known dataset of yeast microarrays and the Yeast Proteome Database, a comprehensive collection of known information of yeast genes, we show that knowledge-based predictions demonstrate better sensitivity and specificity in inferring new transcriptional interactions than predictions from microarray data alone. We also show that comprehensive, direct and high-quality knowledge bases provide better prediction performance. Comparison of our results with ChIP-chip data and growth fitness data suggests that our predicted genome-wide regulatory pairs in yeast are reasonable candidates for follow-up biological verification. Conclusion: High quality, comprehensive, and direct knowledge bases, when combined with appropriate bioinformatic algorithms, can significantly improve the discovery of gene regulatory relationships from high throughput gene expression data.

    Principal components analysis based methodology to identify differentially expressed genes in time-course microarray data

    Background: Time-course microarray experiments are being increasingly used to characterize dynamic biological processes. In these experiments, the goal is to identify genes differentially expressed in time-course data, measured between different biological conditions. These differentially expressed genes can reveal the changes in biological process due to the change in condition, which is essential to understand differences in dynamics. Results: In this paper, we propose a novel method for finding differentially expressed genes in time-course data and across biological conditions (say C1 and C2). We model the expression at C1 using Principal Component Analysis and represent the expression profile of each gene as a linear combination of the dominant Principal Components (PCs). Then the expression data from C2 is projected on the developed PCA model and scores are extracted. The difference between the scores is evaluated using a hypothesis test to quantify the significance of differential expression. We evaluate the proposed method to understand differences in two case studies: (1) the heat shock response of wild-type and HSF1 knockout mice, and (2) the cell cycle in wild-type and Fkh1/Fkh2 knockout yeast strains. Conclusion: In both cases, the proposed method identified biologically significant genes.
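The projection step described in the abstract above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic, the function names are hypothetical, and the hypothesis test (which the abstract does not specify) is replaced by a plain Euclidean distance between the score vectors of each gene under the two conditions.

```python
import numpy as np


def fit_pca(X, n_components):
    """Fit PCA on X (genes x time points); return the mean profile and top components."""
    mean = X.mean(axis=0)
    # SVD of the centered matrix; the rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]


def project(X, mean, components):
    """Scores of each expression profile on the fitted components."""
    return (X - mean) @ components.T


def score_distance(X1, X2, n_components=2):
    """Per-gene distance between condition-1 and condition-2 scores.

    The PCA model is fitted on condition 1 only; condition 2 is projected onto it.
    The distance is an assumed stand-in for the paper's hypothesis test.
    """
    mean, comps = fit_pca(X1, n_components)
    return np.linalg.norm(project(X2, mean, comps) - project(X1, mean, comps), axis=1)


# Synthetic example: 50 genes, 8 time points, one shared temporal pattern.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 8)
pattern = np.sin(2 * np.pi * t)
X1 = rng.normal(size=(50, 1)) * pattern + 0.01 * rng.normal(size=(50, 8))
X2 = X1 + 0.01 * rng.normal(size=(50, 8))
X2[7] += 3.0 * pattern  # gene 7 responds differently under condition 2

dist = score_distance(X1, X2)
print("most changed gene:", int(np.argmax(dist)))
```

Fitting on one condition and projecting the other means that changes lying outside the retained components are not seen by the scores, so the number of components kept matters.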

    Gene Expression Profiles Characterize Inflammation Stages in the Acute Lung Injury in Mice

    Acute Lung Injury (ALI) carries about 50 percent mortality and is frequently associated with an infection (sepsis). Life-support treatment with mechanical ventilation rescues many patients, although superimposed infection or multiple organ failure can result in death. The outcome of a patient developing sepsis depends on two factors: the infection and the pre-existing inflammation. In this study, we described each stage of the inflammation process using a transcriptional approach and an animal model. Female C57BL6/J mice received an intravenous oleic acid injection to induce ALI. Lung expression patterns were analyzed using a 9900 cDNA mouse microarray (MUSV29K). Our gene-expression analysis revealed marked changes in the immune and inflammatory response metabolic pathways, notably lipid metabolism and transcription. The early stage (1 hour–1.5 hours) is characterized by a pro-inflammatory immune response. Later (3 hours–4 hours), the immune cells migrate into inflamed tissues through interaction with vascular endothelial cells. Finally, at late stages of lung inflammation (18 hours–24 hours), metabolism is deeply disturbed. Highly expressed pro-inflammatory cytokines activate transcription of many genes and lipid metabolism. In this study, we described a global overview of critical events occurring during lung inflammation, which is essential to understand infectious pathologies such as sepsis where inflammation and infection are intertwined. Based on these data, it becomes possible to isolate the impact of a pathogen at the transcriptional level from the global gene expression modifications resulting from the infection associated with the inflammation.

    Use of Data-Biased Random Walks on Graphs for the Retrieval of Context-Specific Networks from Genomic Data

    Extracting network-based functional relationships within genomic datasets is an important challenge in the computational analysis of large-scale data. Although many methods, both public and commercial, have been developed, the problem of identifying networks of interactions that are most relevant to the given input data still remains an open issue. Here, we have leveraged the method of random walks on graphs as a powerful platform for scoring network components based on simultaneous assessment of the experimental data as well as local network connectivity. Using this method, NetWalk, we can calculate the distribution of Edge Flux values associated with each interaction in the network, which reflects the relevance of interactions based on the experimental data. We show that network-based analyses of genomic data are simpler and more accurate using NetWalk than with some of the currently employed methods. We also present NetWalk analysis of microarray gene expression data from MCF7 cells exposed to different doses of doxorubicin, which reveals a switch-like pattern in the p53 regulated network in cell cycle arrest and apoptosis. Our analyses demonstrate the use of NetWalk as a valuable tool in generating high-confidence hypotheses from high-content genomic data.
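The idea of a data-biased random walk with per-edge flux, as described in the abstract above, can be illustrated with a small NumPy sketch. It is a simplified reading and not the NetWalk implementation: the transition rule (probability of stepping to a neighbor proportional to that neighbor's data-derived weight), the flux definition (stationary probability of the source node times the transition probability), and the toy graph and weights are all assumptions made for illustration.

```python
import numpy as np


def biased_transition_matrix(adj, weights):
    """Row-stochastic matrix with P[i, j] proportional to adj[i, j] * weights[j].

    Assumes every node has at least one neighbor with positive weight.
    """
    biased = adj * weights[np.newaxis, :]
    return biased / biased.sum(axis=1, keepdims=True)


def stationary_distribution(P, n_iter=1000, tol=1e-12):
    """Power iteration; assumes the chain is connected and aperiodic."""
    pi = np.full(P.shape[0], 1.0 / P.shape[0])
    for _ in range(n_iter):
        nxt = pi @ P
        if np.abs(nxt - pi).sum() < tol:
            return nxt
        pi = nxt
    return pi


def edge_flux(adj, weights):
    """Flux on edge i -> j: stationary probability of i times P[i, j]."""
    P = biased_transition_matrix(adj, weights)
    pi = stationary_distribution(P)
    return pi[:, np.newaxis] * P


# Toy undirected graph: a triangle (0, 1, 2) plus a pendant node 3 attached to 2.
adj = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)
# Hypothetical data-derived node weights (for example, transformed expression values).
weights = np.array([1.0, 2.0, 4.0, 1.0])

flux = edge_flux(adj, weights)
print(np.round(flux, 3))
```

In this simplified form, edges between highly weighted nodes carry more flux, which is one way the experimental data and the local connectivity can be scored together.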